Skip to content

test(recorded): add client cassette coverage (3/5)#1976

Merged
Pouyanpi merged 15 commits into
developfrom
stack/recorded-tests-03-clients
Jun 26, 2026
Merged

test(recorded): add client cassette coverage (3/5)#1976
Pouyanpi merged 15 commits into
developfrom
stack/recorded-tests-03-clients

Conversation

@Pouyanpi

@Pouyanpi Pouyanpi commented Jun 3, 2026

Copy link
Copy Markdown
Collaborator

Summary

Adds recorded client-level coverage for OpenAI chat and embeddings flows.

Why

Client cassette smoke tests validate that the harness can replay basic provider traffic before the rails API coverage layers on top.

What Changed

  • Adds OpenAI chat cassette coverage.
  • Adds OpenAI embeddings cassette coverage.
  • Adds minimal configs used by the client tests.

Stack Position

Part 3 of 5.

Stack Context

This stack decomposes recorded end-to-end replay coverage into reviewable slices. The PRs should be reviewed against their parent branch in the stack.

Please review each PR against its parent branch, not directly against the root base branch, except for part 1.

Order PR Branch Base
1 #1974 stack/recorded-tests-01-harness develop
2 #1975 stack/recorded-tests-02-deterministic-library-load stack/recorded-tests-01-harness
3 #1976 stack/recorded-tests-03-clients stack/recorded-tests-02-deterministic-library-load
4 #1977 stack/recorded-tests-04-public-api stack/recorded-tests-03-clients
5 #1978 stack/recorded-tests-05-library-rails stack/recorded-tests-04-public-api

Validation

poetry check --lock
poetry lock --no-update
poetry install --with dev
poetry run pytest tests/recorded --block-network -q
pre-commit hooks passed during commit creation

@codecov

codecov Bot commented Jun 3, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch from 6fec9aa to 9cad57c Compare June 9, 2026 16:22
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-03-clients branch from dc3a4a7 to 8000a8e Compare June 9, 2026 16:22
@Pouyanpi Pouyanpi marked this pull request as ready for review June 11, 2026 10:33
@greptile-apps

greptile-apps Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR adds client-level cassette coverage for OpenAI chat and embeddings, forming part 3 of a 5-PR stack that decomposes recorded end-to-end replay coverage into reviewable slices. All cassettes are correctly sanitized (no secrets, volatile fields normalized) and the VCR fixture plumbing — cassette directory, body matcher, serializer — aligns with the harness established in earlier PRs.

  • Chat test (test_openai_chat.py): async VCR test that constructs OpenAICompatibleClient with an externally-managed httpx.AsyncClient, calls generate_async, and validates the full result shape including request_id and usage tokens.
  • Embeddings test (test_openai_embeddings.py): synchronous VCR test that instantiates OpenAIEmbeddingModel directly and asserts the decoded base64 embedding has the expected 1536-dimensional shape.
  • Config stubs (openai_chat_config/, openai_embeddings_config/): minimal NeMo Guardrails configs added here but not yet loaded by any test in this PR slice — they appear to be pre-positioned for later PRs in the stack.

Confidence Score: 5/5

Safe to merge; the changes are additive test infrastructure with no modifications to production code.

The PR adds new recorded tests, cassettes, and config stubs. Cassettes are properly sanitized, fixture wiring (VCR cassette dir, body matcher, proxy stripping) is inherited correctly from the parent conftest, and both the async chat and sync embeddings paths are exercised end-to-end under network isolation.

The config stubs in openai_chat_config/ and openai_embeddings_config/ are not yet referenced by any test — worth confirming they will be wired up in a later PR in the stack before this stack lands.

Important Files Changed

Filename Overview
tests/recorded/clients/init.py Package init file with license header and docstring; no functional code.
tests/recorded/clients/test_openai_chat.py Async VCR test for OpenAI chat; constructs OpenAICompatibleClient with an externally-managed httpx client, calls generate_async, and validates result shape against the cassette response.
tests/recorded/clients/test_openai_embeddings.py Synchronous VCR test for OpenAI embeddings; directly instantiates OpenAIEmbeddingModel and validates that the decoded embedding vector has the expected dimension (1536).
tests/recorded/clients/cassettes/test_openai_chat/test_openai_chat_generate_text.yaml Cassette for the chat test; volatile fields (response id, system_fingerprint) are correctly normalized and no secrets are present.
tests/recorded/clients/cassettes/test_openai_embeddings/test_openai_embeddings_sync.yaml Cassette for the embeddings test; stores a single base64-encoded 1536-dim embedding; Authorization header is absent as expected after sanitization.
tests/recorded/clients/configs/openai_chat_config/config.yml Minimal passthrough chat config referencing gpt-4o-mini; added in this PR but not yet loaded by any test in this slice of the stack.
tests/recorded/clients/configs/openai_embeddings_config/config.yml Minimal embeddings config for text-embedding-3-small; added in this PR but not yet loaded by any test in this slice of the stack.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant Test as Test (pytest)
    participant VCR as VCR Cassette
    participant Client as OpenAICompatibleClient / OpenAIEmbeddingModel
    participant API as OpenAI API (mocked)

    Note over Test,API: Replay mode (--block-network)

    Test->>VCR: load cassette YAML
    VCR-->>Test: cassette ready

    alt Chat test (async)
        Test->>Client: OpenAICompatibleClient(base_url, api_key, http_client)
        Test->>Client: model.generate_async("Say hello in one word")
        Client->>VCR: POST /v1/chat/completions
        VCR-->>Client: "200 { content: "Hello!" }"
        Client-->>Test: GenerationResult
        Test->>Test: assert content, finish_reason, request_id, usage
    else Embeddings test (sync)
        Test->>Client: OpenAIEmbeddingModel("text-embedding-3-small", api_key)
        Test->>Client: model.encode(["test"])
        Client->>VCR: POST /v1/embeddings
        VCR-->>Client: "200 { embedding: base64(...) }"
        Client-->>Test: List[List[float]]
        Test->>Test: "assert len==1, len(result[0])==1536"
    end
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
    participant Test as Test (pytest)
    participant VCR as VCR Cassette
    participant Client as OpenAICompatibleClient / OpenAIEmbeddingModel
    participant API as OpenAI API (mocked)

    Note over Test,API: Replay mode (--block-network)

    Test->>VCR: load cassette YAML
    VCR-->>Test: cassette ready

    alt Chat test (async)
        Test->>Client: OpenAICompatibleClient(base_url, api_key, http_client)
        Test->>Client: model.generate_async("Say hello in one word")
        Client->>VCR: POST /v1/chat/completions
        VCR-->>Client: "200 { content: "Hello!" }"
        Client-->>Test: GenerationResult
        Test->>Test: assert content, finish_reason, request_id, usage
    else Embeddings test (sync)
        Test->>Client: OpenAIEmbeddingModel("text-embedding-3-small", api_key)
        Test->>Client: model.encode(["test"])
        Client->>VCR: POST /v1/embeddings
        VCR-->>Client: "200 { embedding: base64(...) }"
        Client-->>Test: List[List[float]]
        Test->>Test: "assert len==1, len(result[0])==1536"
    end
Loading

Reviews (11): Last reviewed commit: "test(recorded): add client cassette cove..." | Re-trigger Greptile

@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch from 9cad57c to 7310fa7 Compare June 11, 2026 10:52
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-03-clients branch from 8000a8e to a253ae2 Compare June 11, 2026 10:52
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch from 7310fa7 to 6b783f2 Compare June 11, 2026 12:44
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-03-clients branch from a253ae2 to 3d1d7ea Compare June 11, 2026 12:44
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch from 6b783f2 to e1954d4 Compare June 11, 2026 13:18
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-03-clients branch 3 times, most recently from 5153569 to caafb61 Compare June 15, 2026 08:57
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch 2 times, most recently from 7000fcc to 5571204 Compare June 15, 2026 12:00
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-03-clients branch from caafb61 to 3136aa1 Compare June 15, 2026 12:00
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch from 5571204 to fa2ce24 Compare June 15, 2026 14:18
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-03-clients branch from 3136aa1 to 06858a2 Compare June 15, 2026 14:18
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch from fa2ce24 to 05def3f Compare June 16, 2026 08:36
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-03-clients branch from 06858a2 to 474d54d Compare June 16, 2026 08:36
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch from 05def3f to f912ae5 Compare June 17, 2026 12:17
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-03-clients branch from 474d54d to 36679f9 Compare June 17, 2026 12:17
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch from f912ae5 to e5a52c6 Compare June 22, 2026 14:09
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-03-clients branch from 36679f9 to d052d99 Compare June 22, 2026 14:09
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-03-clients branch from d052d99 to 3732314 Compare June 23, 2026 10:16
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch from e5a52c6 to 14ddbc7 Compare June 23, 2026 10:16

@tgasser-nv tgasser-nv left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just a few cleanups needed before merging.

Not blocking for this PR since it's the initial implementation, but can you add the following in a follow-on PR (can be outside this stack) to round out the coverage:

  • Streaming inference
  • Tool-calling

Comment thread tests/recorded/clients/configs/openai_chat_config/config.yml Outdated
Comment thread tests/recorded/clients/configs/openai_embeddings_config/config.yml Outdated
Comment thread tests/recorded/clients/test_openai_chat.py Outdated
Comment thread tests/recorded/clients/test_openai_embeddings.py Outdated
Comment thread tests/recorded/clients/test_openai_chat.py Outdated
Pouyanpi added 12 commits June 26, 2026 09:12
Foundation for converging the recorded suite's cross-surface drift, consumed by the
public_api and library layers above:

- rails/helpers.py: shared build_rails() construction helper + async_chunks()
  (replaces the LLMRails(load_config(...)) boilerplate inlined per test, D11/F).
- assertions.py: assert_blocked_generation() asserts refusal + rail stop semantics,
  not just non-empty text (D6).
Replay under --block-network must not depend on ambient proxy env: a SOCKS
proxy makes httpx raise ImportError (missing socksio) on a cassette hit,
turning a deterministic replay into a shell-dependent error. Add an autouse
fixture that strips proxy vars during replay (record_mode == none) while
leaving them intact for recording.

Also fix the README 'Adding a test' snippet to include the imports it relies
on (LLMRails, load_config, suite-local snapshot, OPENAI_BASELINE_CONFIG) so a
new contributor can copy-paste it and land on the intended snapshot re-export.
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch from 14ddbc7 to 82429ff Compare June 26, 2026 07:25
Pouyanpi added 2 commits June 26, 2026 09:29
Adds the library-traversal sibling of
test_load_prompts_sorts_files_for_deterministic_overrides, addressing the
stack-2 review ask. Mocks os.walk to yield two library .co files defining the
same bot message in non-sorted order and asserts the alphabetically-first
file wins the collision, pinning the dirs.sort()/sorted(files) fix in
LLMRails.__init__ so library load order stays filesystem-independent.
Base automatically changed from stack/recorded-tests-02-deterministic-library-load to develop June 26, 2026 07:41
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-03-clients branch from 3732314 to 0c42208 Compare June 26, 2026 07:49
@github-actions

Copy link
Copy Markdown
Contributor

@Pouyanpi Pouyanpi merged commit 9b1f43b into develop Jun 26, 2026
14 checks passed
@Pouyanpi Pouyanpi deleted the stack/recorded-tests-03-clients branch June 26, 2026 09:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants